7 research outputs found

    The FormAI Dataset: Generative AI in Software Security Through the Lens of Formal Verification

    This paper presents the FormAI dataset, a large collection of 112,000 AI-generated compilable and independent C programs with vulnerability classification. We introduce a dynamic zero-shot prompting technique constructed to spawn diverse programs utilizing Large Language Models (LLMs). The dataset is generated by GPT-3.5-turbo and comprises programs with varying levels of complexity. Some programs handle complicated tasks like network management, table games, or encryption, while others deal with simpler tasks like string manipulation. Every program is labeled with the vulnerabilities found within the source code, indicating the type, line number, and vulnerable function name. This is accomplished by employing a formal verification method using the Efficient SMT-based Bounded Model Checker (ESBMC), which uses model checking, abstract interpretation, constraint programming, and satisfiability modulo theories to reason over safety/security properties in programs. This approach definitively detects vulnerabilities and offers a formal model known as a counterexample, thus eliminating the possibility of generating false positive reports. We have associated the identified vulnerabilities with Common Weakness Enumeration (CWE) numbers. We make the source code available for the 112,000 programs, accompanied by a separate file containing the vulnerabilities detected in each program, making the dataset ideal for training LLMs and machine learning algorithms. Our study unveiled that according to ESBMC, 51.24% of the programs generated by GPT-3.5 contained vulnerabilities, thereby presenting considerable risks to software safety and security. Comment: https://github.com/FormAI-Datase

    Data Protection Impact Assessment in Identity Control Management with a Focus on Biometrics

    Privacy issues concerning biometric identification are becoming increasingly relevant due to their proliferation in various fields, including identity and access control management (IAM). The General Data Protection Regulation (GDPR) requires the implementation of a data protection impact assessment for privacy-critical systems. In this paper, we analyse the usefulness of two different privacy impact assessment frameworks in the context of biometric data protection. We use experiences from the SWAN project, which processes four different biometric characteristics for authentication purposes. The results of this comparison elucidate how useful these frameworks are in identifying sector-specific privacy risks related to IAM and biometric identification.

    Deep sequencing analysis of viral short RNAs from an infected Pinot Noir grapevine

    Virus-derived short interfering RNAs (vsiRNAs) isolated from grapevine V. vinifera Pinot Noir clone ENTAV 115 were analyzed by high-throughput sequencing using the Illumina Solexa platform. We identified and characterized vsiRNAs derived from grapevine field plants naturally infected with different viruses belonging to the genera Foveavirus, Maculavirus, Marafivirus and Nepovirus. These vsiRNAs were mainly 21 and 22 nucleotides (nt) in size and were discontinuously distributed throughout Grapevine rupestris stem-pitting associated virus (GRSPaV) and Grapevine fleck virus (GFkV) genomic RNAs. Among the studied viruses, GRSPaV and GFkV vsiRNAs had a 5' terminal nucleotide bias, which differed from that described for experimental viral infections in Arabidopsis thaliana. VsiRNAs were found to originate from both genomic and antigenomic GRSPaV RNA strands, whereas with the grapevine tymoviruses GFkV and Grapevine Red Globe associated virus (GRGV), the large majority derived from the antigenomic viral strand, a feature never observed in other plant–virus interactions.

    NGS of Virus-Derived Small RNAs as a Diagnostic Method Used to Determine Viromes of Hungarian Vineyards

    As virus diseases cannot be controlled by traditional plant protection methods, the risk of their spread has to be minimized in vegetatively propagated plants, such as grapevine. Metagenomic approaches used for virus diagnostics offer a unique opportunity to reveal the presence of all viral pathogens in the investigated plant, which is why their application can reduce the risk of using infected material for a new plantation. Here we used a special branch of this high-throughput method, deep sequencing of virus-derived small RNAs, for virus diagnostics and determined the viromes of vineyards in Hungary. With NGS of virus-derived small RNAs we could detect not only the viruses tested routinely, but also new ones that had never been described in Hungary before. Virus presence did not correlate with the age of the plantation; moreover, phylogenetic analysis of the identified virus isolates suggests that infections are mostly caused by the use of infected propagating material. Our results, validated by other molecular methods, raised further questions to be answered before this method can be introduced as a routine, reliable test for grapevine virus diagnostics.

    Deep sequencing of viroid-derived small RNAs from grapevine provides new insights on the role of RNA silencing in plant-viroid interaction

    Viroids are circular, highly structured, non-protein-coding RNAs that, usurping cellular enzymes and escaping host defense mechanisms, are able to replicate and move through infected plants. Similarly to viruses, viroid infections are associated with the accumulation of viroid-derived 21–24 nt small RNAs (vd-sRNAs) with the typical features of the small interfering RNAs characteristic of RNA silencing, a sequence-specific mechanism involved in defense against invading nucleic acids and in regulation of gene expression in most eukaryotic organisms. Small RNAs from grapevine infected by Hop stunt viroid (HSVd) and Grapevine yellow speckle viroid 1 (GYSVd1) were sequenced by the high-throughput platform Solexa-Illumina, and the vd-sRNAs were analyzed. The large majority of HSVd- and GYSVd1-sRNAs derived from a few specific regions (hotspots) of the genomic (+) and (−) viroid RNAs, with a prevalence of those from the (−) strands of both viroids. When grouped according to their sizes, vd-sRNAs always assumed a distribution with prominent 21-, 22- and 24-nt peaks, which, interestingly, mapped at the same hotspots. These findings show that different Dicer-like enzymes (DCLs) target viroid RNAs, preferentially accessing the same viroid domains. Interestingly, our results also suggest that viroid RNAs may interact with host enzymes involved in the RNA-directed DNA methylation pathway, indicating more complex scenarios than previously thought for both vd-sRNA genesis and possible interference with host gene expression.